Developing Deep Learning Pipeline of Whole-Slide Images for Enhanced Diffuse Large B Cell Lymphoma (DLBCL) Subtyping and Outcome Prediction: Leveraging Self-Attention Transformer for Training and Inference

Zhou, Chen; Xu, Jie; Prakash, Rishab; Torres-Cabala, Carlos P; Chen, Cheng-bang; Madduri, Kamesh; Rao, Arvind; Agasthya, Greeshma; Vega, Francisco; O'Malley, Dennis; Medeiros, L. Jeffrey; Kumara, Soundar; Iyer, Swami P.

doi:10.1182/blood-2023-187291

Chen Zhou

1Penn State University, State College, PA

Jie Xu

2Department of Hematopathology, The University of Texas M.D. Anderson Cancer Center, Houston, TX

Rishab Prakash

3Department of Lymphoma and Myeloma, The University of Texas MD Anderson Cancer Center, Houston, TX

Carlos P Torres-Cabala

4Dermatopathology, University of Texas MD Anderson, Houston, TX

Cheng-bang Chen

5University of Miami, Coral Gables, FL

Kamesh Madduri

1Penn State University, State College, PA

Arvind Rao

6University of Michigan, Ann Arbor, MI

Greeshma Agasthya

7Oak Ridge National Laboratory, Oak Ridge, TN

Francisco Vega

8Department of Hematopathology, MD Anderson Cancer Center, Houston, TX

Dennis O'Malley

9NeoGenomics Laboratories, Inc., Aliso Viejo, CA

L. Jeffrey Medeiros

10Hematopathology, The University of Texas MD Anderson Cancer Center, Houston, TX

Soundar Kumara

1Penn State University, State College, PA

Swami P. Iyer

11Lymphoma and Myeloma, The University of Texas MD Anderson Cancer Center, Houston, TX

Background: Deep learning algorithms can help analyze whole-slide images (WSI) in lymphoma pathology by identifying features and patterns that are not easily discernible to human observers. This pilot project focuses on diffuse large B-cell lymphoma (DLBCL), a heterogeneous disease with diverse genetic alterations. By leveraging self-attention trained clusters and transformers, the project aims to identify patterns and associations between mutational status and overall survival, potentially enhancing personalized treatment strategies and improving data reliability.

Methods: This study employed a computer vision learning pipeline to classify DLBCL subtypes through self-discovery of discriminatory features from scanned WSI. The workflow is shown in Figure 1a. First, to optimize extraction of relevant features, we segmented tiles (patches) from the gigapixel-sized WSI of 223 lymphoma biopsy slides sourced from The Cancer Genome Atlas (TCGA), DLBCL, and Stanford DLBCL-Morph datasets. For feature extraction, we used self-supervised pretraining with a Vision Transformer (ViT) network on a dataset of 1,515,000 patches. These patches were grouped into morphologically similar clusters using K-means, representing various lymphoma proliferation patterns. Dimensionality reduction with UMAP provided computational efficiency and feature visualization. The extracted features were used to predict overall survival with a Bag-of-Words (BoW) representation. We further enhanced the model by incorporating geometric information, including nuclear characteristics from HoverNet. We assessed the model's performance using area under the curve (AUC) and accuracy. Additionally, genomic mutations with a frequency of 10% or higher in TCGA were incorporated for enhanced predictive capability, including PIM1, SGK1, CARD11, KMT2D, SOCS1, BTG1, MUC16 and FAT4. Correlation analysis and hierarchical clustering were performed on mutations, patches, and outcomes (Figure 1b).
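As an illustration of the Bag-of-Words step, the sketch below turns the patches of one slide into a normalized cluster-frequency vector. It is a minimal sketch under stated assumptions, not the pipeline used in this study: the centroids stand in for the output of K-means, the 2-D embeddings stand in for ViT features, and the function names `assign_cluster` and `slide_histogram` are hypothetical.

```python
from collections import Counter
from math import dist


def assign_cluster(embedding, centroids):
    """Return the index of the centroid nearest to one patch embedding."""
    return min(range(len(centroids)), key=lambda k: dist(embedding, centroids[k]))


def slide_histogram(patch_embeddings, centroids):
    """Normalized cluster-frequency vector (Bag-of-Words) for one slide."""
    if not patch_embeddings:
        raise ValueError("a slide needs at least one patch")
    counts = Counter(assign_cluster(e, centroids) for e in patch_embeddings)
    total = len(patch_embeddings)
    return [counts.get(k, 0) / total for k in range(len(centroids))]


# Toy data (assumed for illustration): three 2-D centroids standing in for
# K-means cluster centers, and four patch embeddings from one slide.
centroids = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
patches = [(0.1, -0.2), (4.8, 5.1), (5.2, 4.9), (9.7, 0.3)]

histogram = slide_histogram(patches, centroids)
print(histogram)  # [0.25, 0.5, 0.25]
```

Each slide is reduced to a fixed-length vector regardless of how many patches it contains, which is what allows a conventional classifier to be trained on slide-level labels.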

Results: Using the self-trained ViT encoder as the backbone with a random forest classifier, we achieved an accuracy of 0.88 on a weakly supervised task with all samples. Performance improved significantly compared with a ResNet-50 trained on ImageNet. In addition, saliency maps from the multiple attention heads provide excellent interpretations of morphological characteristics, including tumor stroma, cell location and necrosis. From the feature embeddings extracted by the ViT, 10 morphologically distinct clusters were identified. The clustering results separated tiles into morphologically similar and dissimilar groups that reflect variation in the distribution of lymphoma cells. There was a strong correlation between clusters 1 and 4, the mutated genes, and an increased probability of poor outcomes. When reviewed by the pathologist, cluster 4 showed a high level of necrosis in the WSI. Finally, using the segmented WSI cortex with clustering distribution, our method achieved an AUC of 0.77 for predicting vital status (alive/dead) as an outcome.
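For readers less familiar with the two reported metrics, the sketch below computes AUC and accuracy for a binary vital-status prediction. It is a minimal sketch on made-up labels and scores, not the study's data or evaluation code; `roc_auc` and `accuracy` are hypothetical helper names, and the 0.5 decision threshold is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at the given score threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Made-up example: 1 = deceased, 0 = alive; scores are predicted risk.
vital_status = [1, 1, 1, 0, 0, 0, 0, 1]
risk_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

auc = roc_auc(vital_status, risk_scores)
acc = accuracy(vital_status, risk_scores)
print(auc, acc)  # 0.9375 0.75
```

AUC depends only on how the cases are ranked, whereas accuracy depends on the chosen threshold, so the two can diverge on the same predictions.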

Conclusions: Our pipeline applies machine learning and deep learning tools to WSI for disease classification and outcome prediction in DLBCL. By computationally analyzing the entire tumor landscape, it captures tumor heterogeneity and disease risk, establishing correlations between patch-level characteristics, genomic mutations and overall outcomes. The Vision Transformer (ViT) plays a pivotal role in this process: its self-attention mechanisms identify specific features in histopathology tissue, enabling precise feature extraction and accurate analysis for DLBCL classification. Notably, the ViT model achieves high performance across various tasks without external labeled data, because self-supervised learning lets it extract meaningful information independently. Building on this discovery cohort, we will update the analysis using a larger, independent external cohort from institutional archives, and the results will be presented at the annual meeting.

Disclosures

Vega: Allogene: Research Funding; Geron: Research Funding. O'Malley: NeoGenomics Laboratories: Current Employment. Iyer: AstraZeneca: Research Funding; Drenbio: Research Funding; Acrotech: Consultancy, Research Funding; Innate: Research Funding; Legend: Research Funding; Ono: Research Funding; Pfizer: Research Funding; Seagen: Consultancy, Research Funding; American Society of Hematology: Speakers Bureau; American Society of Transplant and Cellular Therapy: Speakers Bureau; CuraBio: Speakers Bureau; Yingli: Consultancy, Research Funding; Merck: Research Funding; Salarius: Consultancy; CRISPR: Consultancy, Research Funding.


Figure 1


2023


ASH Publications

American Society of Hematology
